Fast Lightweight Suffix Array Construction and Checking
Abstract
We describe an algorithm that, for any v ∈ [2, n], constructs the suffix array of a string of length n in O(vn + n log n) time using O(v + n/√v) space in addition to the input (the string) and the output (the suffix array). By setting v = log n, we obtain an O(n log n) time algorithm using O(n/√(log n)) extra space. This solves the open problem stated by Manzini and Ferragina [ESA ’02] of whether there exists a lightweight (sublinear extra space) O(n log n) time algorithm. The key idea of the algorithm is to first sort a sample of suffixes chosen using mathematical constructs called difference covers. The algorithm is not only lightweight but also fast in practice, as demonstrated by experiments. Additionally, we describe fast and lightweight suffix array checkers, i.e., algorithms that check the correctness of a suffix array.
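Two notions from the abstract can be illustrated with a minimal Python sketch. It assumes the standard definition of a difference cover modulo v (a set D ⊆ [0, v) such that every residue modulo v is a difference of two elements of D), and it uses a naive checker that compares adjacent suffixes directly. The function names `is_difference_cover` and `check_suffix_array` are hypothetical, and neither function is the construction or the lightweight checker described in the paper; the naive checker can take quadratic time and copies suffixes, so it is for illustration only.

```python
from itertools import product


def is_difference_cover(d, v):
    """Return True if every residue in [0, v) equals (a - b) mod v for some a, b in d."""
    residues = {(a - b) % v for a, b in product(d, repeat=2)}
    return residues == set(range(v))


def check_suffix_array(text, sa):
    """Naive suffix array check (illustrative assumption, not the paper's checker).

    sa is accepted if it is a permutation of range(len(text)) and the
    suffixes it lists are in strictly increasing lexicographic order.
    """
    n = len(text)
    # Condition 1: every suffix start position appears exactly once.
    if sorted(sa) != list(range(n)):
        return False
    # Condition 2: each suffix is smaller than its successor in sa.
    return all(text[sa[i]:] < text[sa[i + 1]:] for i in range(n - 1))


if __name__ == "__main__":
    print(is_difference_cover({1, 2, 4}, 7))
    print(check_suffix_array("banana", [5, 3, 1, 0, 4, 2]))
```

For instance, {1, 2, 4} is a difference cover modulo 7 with only three elements, which hints at why a sample chosen this way can be much smaller than the full set of suffixes.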
Similar resources
Fast and Lightweight LCP-Array Construction Algorithms
The suffix tree is a very important data structure in string processing, but it suffers from a huge space consumption. In large-scale applications, compressed suffix trees (CSTs) are therefore used instead. A CST consists of three (compressed) components: the suffix array, the LCP-array, and data structures for simulating navigational operations on the suffix tree. The LCP-array stores the leng...
Lightweight LCP-Array Construction in Linear Time
The suffix tree is a very important data structure in string processing, but it suffers from a huge space consumption. In large-scale applications, compressed suffix trees (CSTs) are therefore used instead. A CST consists of three (compressed) components: the suffix array, the LCP-array, and data structures for simulating navigational operations on the suffix tree. The LCP-array stores the leng...
Lightweight Parameterized Suffix Array Construction
We present a first algorithm for direct construction of parameterized suffix arrays and parameterized longest common prefix arrays for non-binary strings. Experimental results show that our algorithm is much faster than naïve methods.
An Incomplex Algorithm for Fast Suffix Array Construction
Our aim is to provide full text indexing data structures and algorithms for universal usage in text indexing. We present a practical algorithm for suffix array construction. The fundamental algorithm is less complex than other construction algorithms. We achieve very fast construction times for common strings as well as for worst case strings by enhancing our basic algorithms with further techn...
Engineering a Lightweight External Memory Suffix Array Construction Algorithm
We describe an external memory suffix array construction algorithm based on constructing suffix arrays for blocks of text and merging them into the full suffix array. The basic idea goes back over 20 years and there have been a couple of later improvements, but we describe several further improvements that make the algorithm much faster. In particular, we reduce the I/O volume of the algorithm by a fa...
Publication date: 2003